Patent abstract:
VARIABLE BIT VIDEO STREAMS FOR ADAPTIVE STREAMING. One embodiment of the present invention discloses a technique for adapting playback bit rates in a content delivery system based on the scene complexity of the video content as well as on network conditions and other performance factors. A scene complexity map of the video content indicates the complexity levels of different scenes within the video content. Using the scene complexity map, the content player can transfer scenes of lower complexity levels from video streams encoded at lower bit rates to manage the bandwidth consumed in transferring the video content, thereby enabling scenes of higher complexity levels to be transferred from video streams encoded at higher bit rates.
Publication number: BR112013013944B1
Application number: R112013013944-7
Filing date: 2011-12-06
Publication date: 2022-01-25
Inventor: Neil D. Hunt
Applicant: Netflix, Inc.
IPC main classification:
Patent description:

CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of United States patent application 12/961,375, filed December 6, 2010, which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention
Embodiments of the present invention relate generally to digital media and, more specifically, to variable bit video streams for adaptive streaming.

Description of Related Art
Digital content delivery systems conventionally include a content server, a content player, and a communications network connecting the content server to the content player. The content server is configured to store digital content files, which can be transferred from the content server to the content player. Each digital content file corresponds to a specific identifying title, such as “Gone with the Wind”, which is familiar to a user. The digital content file typically includes sequential content data, organized according to playback chronology, and may comprise audio data, video data, or a combination thereof.
The content player is configured to transfer and play a digital content file in response to a user request selecting the title for playback. The process of playing the digital content file includes decoding and rendering audio and video data into an audio signal and a video signal, which can drive a display system having a speaker subsystem and a video subsystem. Playback typically involves a technique known in the art as "streaming," whereby the content server sequentially streams the digital content file to the content player, and the content player plays the digital content file while the content data comprising the digital content file is received. To account for variable latency and bandwidth within the communications network, a content buffer queues the incoming content data ahead of the content data actually being played. During times of network congestion, which result in less available bandwidth, less content data is added to the content buffer, and the content buffer may be depleted as content data is dequeued to support playback at a certain playback bit rate. However, during times of high network bandwidth, the content buffer is replenished and additional buffered time is added until the content buffer is generally full again. In practical systems, the content buffer can queue content data corresponding to a period of time ranging from seconds to more than a minute.
Each digital content file stored on the content server is typically encoded for a specific playback bit rate. Before starting playback, the content player may measure the bandwidth available from the content server and select a digital content file having a bit rate that can be supported by the measured available bandwidth. To maximize playback quality, a digital content file with the highest bit rate not exceeding the measured bandwidth is conventionally selected. To the extent that the communications network can provide adequate bandwidth to transfer the selected digital content file while satisfying the bit rate requirements, playback proceeds satisfactorily. In practice, however, the bandwidth available on the communications network constantly changes as different devices connected to the communications network perform independent tasks.
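The conventional selection rule described above can be sketched as follows; this is an illustrative snippet, not code from the patent, and the function name and bit rate values are invented:

```python
# Hypothetical sketch of the conventional selection rule: pick the
# highest-bitrate encoding that the measured bandwidth can sustain.
# Bit rates are in kilobits per second; values are illustrative.

def select_stream(available_bitrates_kbps, measured_bandwidth_kbps):
    """Return the highest bit rate not exceeding the measured bandwidth,
    falling back to the lowest encoding if none fits."""
    candidates = [b for b in available_bitrates_kbps
                  if b <= measured_bandwidth_kbps]
    return max(candidates) if candidates else min(available_bitrates_kbps)
```

For example, with encodings at 500, 1000, and 3000 kbps and 1200 kbps of measured bandwidth, the 1000 kbps stream would be chosen.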
To mitigate the variability of network conditions, adaptive streaming can be implemented where, for each title, there are multiple video streams having different bit rates. As network conditions vary, the content player may switch between video streams according to those conditions. For example, video data can be transferred from video streams encoded at higher bit rates when network conditions are good, and, when network conditions deteriorate, subsequent video data can be transferred from video streams encoded at lower bit rates.
A problem arises with implementing an adaptive streaming solution when video streams are encoded using a variable bit rate (VBR) technique. In a VBR video stream, to optimize the bandwidth utilization or the space used by a file, different video scenes are encoded based on the complexity of those scenes. A low-complexity scene is encoded at a lower bit rate to "save" bits for scenes that have a higher complexity. The average bit rate across a VBR video stream is thus not reflective of the bit rate of a particular scene within the VBR video stream. This poses a problem when implementing adaptive streaming because the content player selects an encoded video stream based on the average bit rate, but the specific portions of video data transferred from the encoded video stream may be encoded at a bit rate that is much higher or much lower than the average bit rate. In such a scenario, switching between encoded video streams may not be appropriate or effective, thus reducing overall playback quality.
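The mismatch between the average and per-scene bit rates of a VBR stream can be illustrated with invented numbers:

```python
# Illustrative numbers only: per-scene bit rates (kbps) of a
# hypothetical VBR stream; none of these values come from the patent.
scene_bitrates = [800, 3200, 900, 4000, 1100]

average_bitrate = sum(scene_bitrates) / len(scene_bitrates)  # 2000.0 kbps

# A player that budgets bandwidth for the 2000 kbps average will
# underrun on the 4000 kbps scene and waste capacity on the 800 kbps one.
peak_over_average = max(scene_bitrates) / average_bitrate  # 2.0
```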
As the foregoing illustrates, what is needed in the art is an approach for transferring digital content to a content player based on the scene complexity of the digital content.

SUMMARY OF THE INVENTION
One embodiment of the present invention discloses a method for adaptively transferring digital video content. The method comprises the steps of receiving a scene complexity map associated with the digital video content that specifies a complexity level for each part of the digital video content; identifying a plurality of encoded video streams associated with the digital video content, where each encoded video stream is associated with a different bit rate and includes, for each part of the digital video content, a portion encoded at that bit rate; determining, based on the scene complexity map, the complexity level associated with a first part of the digital video content; dynamically determining, during playback of a different part of the digital video content and based on the complexity level associated with the first part of the digital video content, a first encoded video stream included in the plurality of encoded video streams from which to transfer a first encoded part corresponding to the first part of the digital video content; and transferring, for playback, the first encoded part of the first encoded video stream to a content buffer residing within a content player device.
One advantage of the disclosed technique is that a variable bit encoded stream is dynamically generated by the content player at playback time by selecting parts of video data from different constant bit rate encoded streams based on the complexity levels of those parts of video data. Such a technique allows the playback of video data to be optimized, generating the highest-quality video stream for playback based on current conditions and scene complexities.

BRIEF DESCRIPTION OF THE DRAWINGS
So that the manner in which the above-recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit other equally effective embodiments.
Figure 1 illustrates a content delivery system configured to implement one or more aspects of the present invention.
Figure 2 is a more detailed view of the encoding server of Figure 1, according to an embodiment of the invention.
Figure 3 is an illustration of a scene complexity map generated by the complexity map generator of Figure 2, in accordance with an embodiment of the invention.
Figure 4 is an illustration of different video streams generated by the video stream encoder, according to an embodiment of the invention.
Figure 5 is a more detailed view of the content player of Figure 1, according to an embodiment of the invention.
Figure 6 is a flowchart of method steps for selecting, based on scene complexity, one of a plurality of video streams from which to play a next scene, in accordance with an embodiment of the invention.

DETAILED DESCRIPTION
In the following description, numerous specific details are set forth to provide a more complete understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention can be practiced without one or more of these specific details. In other cases, well known features have not been described in order to avoid obscuring the present invention.
Figure 1 illustrates a content delivery system 100 configured to implement one or more aspects of the invention. As shown, the content delivery system 100 includes an encoding server 102, a communications network 104, a content delivery network (CDN) 106, and a content player 108.
The communications network 104 includes a plurality of network communications systems, such as routers and switches, configured to facilitate data communication between the encoding server 102, the CDN 106, and the content player 108. Those skilled in the art will recognize that there are many technically feasible techniques for building the communications network 104, including technologies practiced in implementing the well-known Internet communications network.
Encoding server 102 is a computer system configured to encode video streams associated with digital content files for adaptive streaming. The encoding workflow for encoding the video streams for adaptive streaming is described in more detail below with reference to Figures 2 and 3. The content delivery system 100 may include one or more encoding servers 102, where each encoding server 102 is configured to perform all functions necessary to encode video streams or where each encoding server 102 is configured to perform a particular function required to encode video streams. Digital content files including encoded video streams are retrieved by CDN 106 via communications network 104 for distribution to content player 108.
The CDN 106 comprises one or more computer systems configured to serve transfer requests for digital content files from the content player 108. The digital content files may reside on a mass storage system accessible to the computer system. The mass storage system may include, without limitation, direct-attached storage, network-attached file storage, or network-attached block-level storage. The digital content files can be formatted and stored on the mass storage system using virtually any technically feasible technique. A data transfer protocol, such as the well-known hypertext transfer protocol (HTTP), can be used to transfer the digital content files from the CDN 106 to the content player 108.
The content player 108 may comprise a computer system, a set-top box, a mobile device such as a mobile phone, or any other technically feasible computing platform that has network connectivity and is coupled to or includes a display device and a speaker device for presenting video frames and generating acoustic output, respectively. The content player 108 is configured for adaptive streaming, i.e., to transfer units of a video stream encoded for a specific playback bit rate, and to switch to transferring subsequent units of a video stream encoded for a different playback bit rate based on prevailing bandwidth conditions within the communications network 104. As available bandwidth within the communications network 104 becomes limited, the content player 108 may select a video stream encoded for a lower playback bit rate. As available bandwidth increases, a video stream encoded for a higher playback bit rate can be selected.
Although, in the above description, the content delivery system 100 is shown with one content player 108 and one CDN 106, those skilled in the art will recognize that the architecture of Figure 1 represents only an exemplary embodiment of the invention. Other embodiments may include any number of content players 108 and/or CDNs 106. Thus, Figure 1 is in no way intended to limit the scope of the present invention.
Figure 2 is a more detailed illustration of the encoding server 102 of Figure 1, in accordance with an embodiment of the invention. As shown, the encoding server 102 includes a central processing unit (CPU) 202, a system disk 204, an input/output (I/O) device interface 206, a network interface 208, an interconnect 210, and a system memory 212.
The CPU 202 is configured to retrieve and execute programming instructions stored in the system memory 212. Similarly, the CPU 202 is configured to store application data in, and retrieve application data from, the system memory 212. The interconnect 210 is configured to facilitate transmission of data, such as programming instructions and application data, between the CPU 202, the system disk 204, the input/output device interface 206, the network interface 208, and the system memory 212. The input/output device interface 206 is configured to receive input data from the input/output devices 222 and transmit the input data to the CPU 202 via the interconnect 210. For example, the input/output devices 222 may comprise one or more buttons, a keyboard, and a mouse or other pointing device. The input/output device interface 206 is also configured to receive output data from the CPU 202 via the interconnect 210 and transmit the output data to the input/output devices 222. The system disk 204, such as a hard disk drive or flash memory storage drive, is configured to store non-volatile data such as encoded video streams. The encoded video streams can then be retrieved by the CDN 106 via the communications network 104. The network interface 208 is coupled to the CPU 202 via the interconnect 210 and is configured to transmit and receive data packets via the communications network 104. In one embodiment, the network interface 208 is configured to operate in accordance with the well-known Ethernet standard.
System memory 212 includes software components that include instructions for encoding one or more video streams associated with a specific content title for adaptive streaming. As shown, these software components include a complexity map generator 214, a video stream encoder 216, and a sequence header index (SHI) generator 218.
For a particular video stream, the complexity map generator 214 analyzes the video stream to determine the complexity of the video content within different parts of the video stream (referred to in this document as "scenes"). A complex scene is typically a scene that changes significantly from frame to frame, such as a car crash scene in an action movie. Conversely, a simple scene is typically a scene with few frame-to-frame changes, such as a scene of a still body of water at night. The complexity map generator 214 can analyze the video stream based on predetermined heuristic information. Based on the analysis, the complexity map generator 214 generates a scene complexity map, which is described in more detail below with respect to Figure 3.
Video stream encoder 216 performs encoding operations to encode a video stream to a specific playback bit rate such that the encoded video stream conforms to a particular video encoder/decoder standard, such as VC1, and is configured for adaptive streaming. In an alternative embodiment, the video stream may be encoded to conform to a different video encoder/decoder standard such as MPEG or H.264. In operation, for a particular video stream, the video stream encoder 216 encodes the video stream to different constant bit rates to generate multiple encoded video streams, each encoded video stream associated with a different constant bit rate and thus having a different quality. An encoded video stream generated by the video stream encoder 216 includes a sequence of groups of pictures (GOPs), each GOP comprising multiple frames of video data.
The SHI generator 218 generates a sequence header index associated with each encoded video stream. To generate the sequence header index, the SHI generator 218 first searches the encoded video stream for the key frames associated with the different GOPs included in the encoded video stream. The key frames can be located by the SHI generator 218 based on the sequence start codes specified in the sequence headers included in the key frames. For the GOP associated with each of the identified key frames, the SHI generator 218 defines a switch point within the sequence header index that stores (i) a data packet number identifying the data packet that includes the key frame associated with the GOP and (ii) the playback offset associated with the GOP. The playback offset associated with the GOP is determined based on the location of the GOP in the sequence of GOPs included in the encoded video stream.
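As a rough sketch, a sequence header index can be thought of as a mapping from playback offsets to the data packets holding the corresponding key frames; the function name, GOP duration, and packet numbers below are purely illustrative and not taken from the patent:

```python
# Minimal sketch of a sequence header index: for each GOP's key frame,
# record a switch point pairing the GOP's playback offset (seconds)
# with the number of the data packet that holds the key frame.

def build_sequence_header_index(gop_packet_numbers, gop_duration_s=3):
    """Map each GOP's playback offset to the packet holding its key frame."""
    return {i * gop_duration_s: pkt
            for i, pkt in enumerate(gop_packet_numbers)}

# Invented packet numbers for a four-GOP stream with 3-second GOPs.
index = build_sequence_header_index([17, 112, 239, 405])
# index[6] is 239: the GOP at the 6-second offset lives in packet 239.
```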
The encoding server 102 can generate multiple encoded video streams associated with the same content title and encoded for different playback bit rates in the manner described above. The encoding process described in this document ensures that, across the different encoded video streams, the GOPs are associated with the same playback time interval and that corresponding GOPs across the different encoded video streams are associated with the same playback offsets. Therefore, each switch point defined in the sequence header index included in one of the encoded video streams associated with a specific content title has a corresponding switch point defined in the sequence header index included in each of the other encoded video streams associated with the same content title.
Based on the sequence header indices included in two encoded video streams associated with the same content title, a content player can efficiently switch between the encoded video streams by identifying the appropriate switch points in the sequence header indices. When switching between a currently playing encoded video stream and a new encoded video stream, a content player, such as the content player 108, searches the sequence header index included in the new encoded video stream to locate the particular switch point specifying the playback offset associated with the next GOP to be played. The content player can then switch to the new encoded video stream and transfer, for playback, the GOP stored in the data packet specified at the particular switch point. For example, for encoded video streams in which each GOP is associated with a three-second playback time interval, if the first GOP, associated with a zero-second playback offset, were currently being played, then the next GOP to be played would be associated with the three-second playback offset. In such a scenario, the content player would search the sequence header index associated with the new encoded stream for the particular switch point specifying a three-second playback offset. Once the particular switch point was located, the content player would transfer, for playback, the GOP stored in the data packet specified at that switch point.
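The switch described above amounts to a lookup in the new stream's sequence header index; the offsets and packet numbers below are illustrative only:

```python
# Sketch of switching streams at a GOP boundary. The new stream's
# sequence header index maps playback offsets (seconds) to packet
# numbers; all values here are invented for illustration.

current_offset_s = 0   # offset of the GOP currently playing
gop_duration_s = 3     # each GOP covers a three-second interval
next_offset_s = current_offset_s + gop_duration_s

new_stream_index = {0: 21, 3: 148, 6: 301}  # offset -> packet number
packet_to_fetch = new_stream_index[next_offset_s]
# The player transfers packet 148 from the new stream and resumes there.
```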
In practice, a GOP may include multiple scenes or parts of a scene. For simplicity, the discussion set out below focuses on particular scenes within an encoded video stream rather than on GOPs within the encoded video stream. Although a content player such as the content player 108 can switch between different encoded video streams at the GOP boundaries defined by the corresponding sequence header indices, the switching process must also account for the complexities of the different scenes included in the GOPs. This switching process is described in additional detail below.
Figure 3 is an illustration of a scene complexity map 302 generated by the complexity map generator 214 of Figure 2, in accordance with an embodiment of the invention. As shown, the scene complexity map 302 specifies the complexity level of different scenes 304 within a video stream. For example, scene 304(0) and scene 304(4) each have a medium scene complexity, scene 304(1) has a low scene complexity, and scene 304(2) and scene 304(3) each have a high scene complexity. Other embodiments of the scene complexity map 302 are also contemplated by this invention. In alternative embodiments, the scene complexity levels may be number-based and/or more granular. For purposes of this invention, a scene complexity map 302 specifies a scene complexity level for each scene in a video stream, where a particular scene corresponds to a specific set of frames within the video stream.
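One plausible in-memory representation of the scene complexity map 302 of Figure 3 is a simple lookup table; the dictionary layout and helper function are assumptions for illustration, not the patent's format:

```python
# Scene index -> complexity level, mirroring the labels of Figure 3.
scene_complexity_map = {
    0: "medium",
    1: "low",
    2: "high",
    3: "high",
    4: "medium",
}

def complexity_of(scene_index):
    """Return the complexity level recorded for a scene."""
    return scene_complexity_map[scene_index]
```

As the text notes, alternative embodiments could store numeric or finer-grained levels in the same structure.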
Figure 4 is an illustration of the different encoded video streams 404 generated by the video stream encoder 216, according to an embodiment of the invention. As shown, each encoded video stream 404 is associated with the same title and includes the scenes 304 illustrated in Figure 3. Additionally, each encoded video stream 404 is encoded to a different bit rate. The encoded video stream 404(0) is encoded to a bit rate lower than that of the encoded video stream 404(1). Similarly, the encoded video stream 404(1) is encoded to a bit rate lower than that of the encoded video stream 404(2). As also shown, the allocation of bits to each scene 304 within a given encoded video stream 404 is constant. For example, each scene 304 within the encoded video stream 404(0) has a bit allocation identified by the bit allocation 406. Similarly, each scene 304 within the encoded video stream 404(1) has a bit allocation identified by the bit allocation 408, and each scene 304 within the encoded video stream 404(2) has a bit allocation identified by the bit allocation 410. Importantly, the bit allocations 406, 408, and 410 vary according to the bit rate associated with the corresponding encoded video stream 404, where the bit allocation 406 is less than the bit allocation 408, and the bit allocation 408 is less than the bit allocation 410.
Figure 5 is a more detailed view of the content player 108 of Figure 1, in accordance with an embodiment of the invention. As shown, the content player 108 includes, without limitation, a central processing unit (CPU) 510, a graphics subsystem 512, an input/output (I/O) device interface 514, a network interface 518, an interconnect 520, and a memory subsystem 530. The content player 108 may also include a mass storage unit 516.
CPU 510 is configured to retrieve and execute programming instructions stored in memory subsystem 530. Similarly, CPU 510 is configured to store and retrieve application data residing in memory subsystem 530. Interconnect 520 is configured to promote transmission of data, such as programming instructions and application data, between CPU 510, graphics subsystem 512, input/output device interface 514, mass storage 516, network interface 518, and memory subsystem 530.
The graphics subsystem 512 is configured to generate frames of video data and transmit the frames of video data to the display device 550. In one embodiment, the graphics subsystem 512 may be integrated into an integrated circuit along with the CPU 510. The display device 550 may comprise any technically feasible device for generating an image for display. For example, the display device 550 may be manufactured using liquid crystal display (LCD) technology, cathode ray technology, or light-emitting diode (LED) display technology (organic or inorganic). The input/output (I/O) device interface 514 is configured to receive input data from the user input/output devices 552 and transmit the input data to the CPU 510 via the interconnect 520. For example, the user input/output devices 552 may comprise one or more buttons, a keyboard, and a mouse or other pointing device. The input/output device interface 514 also includes an audio output unit configured to generate an electrical audio output signal. The user input/output devices 552 include a speaker configured to generate an acoustic output in response to the electrical audio output signal. In alternative embodiments, the display device 550 may include the speaker. A television is an example of a device known in the art that can display video frames and generate an acoustic output. The mass storage unit 516, such as a hard disk drive or flash memory storage unit, is configured to store non-volatile data. The network interface 518 is configured to transmit and receive data packets over the communications network 104. In one embodiment, the network interface 518 is configured to communicate using the well-known Ethernet standard. The network interface 518 is coupled to the CPU 510 via the interconnect 520.
The memory subsystem 530 includes instructions and programming data comprising an operating system 532, a user interface 534, and a playback application 536. The operating system 532 performs system management functions, such as managing hardware devices including the network interface 518, the mass storage unit 516, the input/output device interface 514, and the graphics subsystem 512. The operating system 532 also provides process and memory management models for the user interface 534 and the playback application 536. The user interface 534 provides a specific framework, such as a window and object metaphor, for user interaction with the content player 108. Those skilled in the art will recognize the various operating systems and user interfaces that are well known in the art and suitable for incorporation into the content player 108.
Playback application 536 is configured to retrieve digital content from CDN 106 via network interface 518 and play the digital content via graphics subsystem 512. Graphics subsystem 512 is configured to transmit a rendered video signal to the display device 550. In normal operation, playback application 536 receives a request from a user to play a specific title. Playback application 536 then identifies the different encoded video streams associated with the requested title, wherein each encoded video stream is encoded to a different playback bit rate. After the playback application 536 has located the encoded video streams associated with the requested title, the playback application transfers sequence header indices associated with each encoded video stream associated with the requested title from the CDN 106. As described earlier in this document, a sequence header index associated with an encoded video stream includes information related to the encoded sequence included in the digital content file.
In one embodiment, the playback application 536 begins by transferring the digital content file associated with the requested title that comprises the sequence encoded for the lowest playback bit rate, to minimize startup time for playback. For discussion purposes, this digital content file is referred to below as the requested digital content file. The requested digital content file is transferred into the content buffer 543, which is configured to serve as a first-in, first-out queue. In one embodiment, each transferred data unit comprises a unit of video data or a unit of audio data. The units of video data associated with the requested digital content file are transferred to the content player 108 and pushed into the content buffer 543. Similarly, the units of audio data associated with the requested digital content file are transferred to the content player 108 and pushed into the content buffer 543. In one embodiment, the units of video data are stored in the video buffer 546 within the content buffer 543, and the units of audio data are stored in the audio buffer 544, also within the content buffer 543.
A video decoder 548 reads units of video data from the video buffer 546 and renders the units of video data into a sequence of video frames corresponding in duration to the fixed playback time period. Reading a unit of video data from the video buffer 546 effectively dequeues it from the video buffer 546 (and from the content buffer 543). The sequence of video frames is processed by the graphics subsystem 512 and transmitted to the display device 550.
An audio decoder 542 reads units of audio data from the audio buffer 544 and renders the units of audio data into a sequence of audio samples, generally synchronized in time with the sequence of video frames. In one embodiment, the sequence of audio samples is transmitted to the input/output device interface 514, which converts the sequence of audio samples into the electrical audio signal. The electrical audio signal is transmitted to the speaker within the user input/output devices 552, which, in response, generates an acoustic output.
Given the bandwidth limitations of the communications network 104, the playback application 536 may transfer consecutive portions of video data from different constant bit rate encoded video streams based on scene complexities. In operation, when playback is initiated, the playback application 536 receives the scene complexity map 302 associated with the digital video being played. As described above, the scene complexity map 302 specifies the complexity levels of the different scenes of the digital video. When selecting a next portion of video data to transfer, the playback application 536 determines, based on the scene complexity map 302, the complexity level of the scene(s) included in that portion of video data. Based on the complexity level of the scene(s) and one or more performance factors, the playback application 536 then determines the particular encoded video stream from which to transfer the portion of video data. For example, in a scenario where the available bandwidth is low, if the scene(s) are of low complexity, then the playback application 536 transfers the portion of video data including those scenes from a video stream encoded at a low bit rate. In this way, the bandwidth of the communications network 104 can be effectively managed by the playback application 536 so that subsequent portions of video data containing scenes of greater complexity can be transferred from a video stream encoded at a higher bit rate. In such a scenario, less bandwidth is used to transfer low-complexity scenes than medium-complexity scenes, and bandwidth is advantageously conserved so that portions of video data including highly complex scenes can be transferred from video streams encoded at medium or high bit rates. In contrast, a conventional content player simply selects one of the variable bit rate encoded video streams based on the available bandwidth, without regard to the complexity of the scene encoded in the particular portion of the variable bit rate video stream being transferred.
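A simplified sketch of such a selection heuristic follows; the bit rate values, thresholds, and the mapping from complexity levels to encoding levels are assumptions for illustration, as the patent leaves the exact policy to the content player:

```python
# Assumed mapping of complexity levels to target encoding bit rates (kbps).
BITRATES_KBPS = {"low": 500, "medium": 1500, "high": 3000}

def choose_encoding(scene_complexity, available_bandwidth_kbps):
    """Spend bandwidth on complex scenes; save it on simple ones.

    Returns the bit rate of the constant bit rate stream to transfer
    the next portion of video data from.
    """
    desired = BITRATES_KBPS[scene_complexity]
    if desired <= available_bandwidth_kbps:
        return desired
    # Fall back to the best encoding the network can currently sustain,
    # or the lowest level if even that exceeds the available bandwidth.
    affordable = [b for b in BITRATES_KBPS.values()
                  if b <= available_bandwidth_kbps]
    return max(affordable) if affordable else min(BITRATES_KBPS.values())
```

For example, a low-complexity scene is fetched at 500 kbps even when 2000 kbps of bandwidth is available, conserving bandwidth for a later high-complexity scene.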
Other performance factors, in addition to the complexity levels of the scenes included in a portion of video data, that can influence the specific encoded stream from which to transfer the portion of video data include the complexity levels of subsequent scenes of the video data, the size of the video buffer 546, the behavior of the end user viewing the video content, the type of display being driven (high definition, standard definition, etc.), and the available runtime. These factors, combined with the bandwidth limitations of the communications network 104, can be used to determine a specific encoded video stream from which to transfer each portion of video data based on the complexity levels of the scenes included in that portion. In this manner, a variable bit rate video stream is generated from different constant bit rate encoded video streams.
In an alternative embodiment, only portions of video data that include high-complexity scenes are encoded at a high bit rate. Similarly, only portions of video data that include scenes of medium or high complexity are encoded at a medium bit rate. Portions of video data that include only low-complexity scenes are encoded only at a low bit rate. Referring again to Figure 4, the medium bit rate encoding level of the video stream, encoded video stream 404(1), would not include scene 304(1), and the high bit rate encoding level of the video stream, encoded video stream 404(2), would not include scenes 304(0), 304(1), and 304(4). In such an embodiment, the playback application 536 may transfer only portions of video data including high-complexity scenes from video streams encoded at high bit rates, and all other portions of video data from video streams encoded at lower bit rates.
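The sparse encoding scheme of this alternative embodiment can be expressed as a small helper that decides which portions each bitrate level must encode. The names and the list-of-strings representation below are hypothetical, chosen only to illustrate the inclusion rule.

```python
# In the alternative embodiment, the low-rate stream encodes every portion,
# the medium-rate stream encodes only medium- and high-complexity portions,
# and the high-rate stream encodes only high-complexity portions.

RANK = {"low": 0, "medium": 1, "high": 2}

def portions_for_level(complexities, level):
    """Return the indices of portions that a given bitrate level encodes.

    complexities: per-portion complexity, e.g. ["low", "high", "medium"]
    level: "low", "medium", or "high"
    """
    # A portion is included only if its complexity is at least the level's rank.
    return [i for i, c in enumerate(complexities) if RANK[c] >= RANK[level]]
```

This mirrors the Figure 4 example: portions absent from a level's list are simply never encoded at that bit rate, reducing storage for streams that would rarely be selected for those portions anyway.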
Figure 6 is a flowchart of method steps for selecting, based on scene complexity, one of a plurality of video streams from which to play a next scene, in accordance with one embodiment of the invention. While the method steps are described in conjunction with the systems of Figures 1-5, persons skilled in the art will understand that any system configured to perform the method steps, in any order, is within the scope of the invention.
In step 602, the playback application 536 receives the scene complexity map 302 associated with the digital video for which playback has been initiated. As described above, the scene complexity map 302 specifies the complexity level of the different scenes of the digital video. In step 604, the playback application 536 identifies a set of encoded video streams associated with the digital video to be played. Each encoded video stream is encoded at a different bit rate, as described above in conjunction with Figure 3.
In step 606, the playback application 536 determines, for a next portion of the video data, the complexity level associated with the scene(s) included in that next portion. The complexity level is determined based on the scene complexity map 302 received in step 602. In step 608, the playback application 536 then selects a specific encoded video stream from which to transfer the next portion of the video data based on the determined complexity level as well as one or more performance factors. As described above, the performance factors can include bandwidth limitations and the size of the content buffer 543. To select the specific encoded video stream, the playback application 536 executing on the content player 108 dynamically determines the encoding level (high, medium, or low bit rate) of the video stream for the next portion of the video data to be transferred while playing a different (previous) portion of the digital video content.
In step 610, the playback application 536 determines whether another time interval occurs during playback of the video data. If so, the playback application 536 repeats steps 606 and 608 for another portion of the video stream. When no further time interval occurs during playback of the video data, playback of the video content is complete. The time interval can occur at a constant rate (in seconds or frames) or be triggered based on the filling or emptying of the content buffer 543.
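Steps 602-610 can be summarized as a loop over portions. The sketch below is a simplified rendering with assumed helper names (`select_stream`, `transfer`); it is not the actual implementation of the playback application 536.

```python
def playback_loop(scene_complexity_map, encoded_streams, select_stream, transfer):
    """Sketch of the method of Figure 6.

    scene_complexity_map: complexity level per portion (received in step 602).
    encoded_streams: dict mapping bitrate level -> stream handle (identified in step 604).
    select_stream: picks a bitrate level from a complexity value (steps 606-608).
    transfer: downloads one portion from the chosen stream.
    """
    downloaded = []
    # Step 610: repeat the selection for each successive portion until playback ends.
    for index, complexity in enumerate(scene_complexity_map):
        level = select_stream(complexity)                      # steps 606-608
        downloaded.append(transfer(encoded_streams[index and 0 or 0] if False else encoded_streams[level], index))
    return downloaded
```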
One advantage of the disclosed technique is that a variable bit rate encoded stream is dynamically generated at playback time by selecting portions of video data from different constant bit rate encoded streams based on the complexity levels of those portions. Such a technique optimizes the playback of the video data and generates the highest-quality video stream playback possible given the current conditions and scene complexities.
One embodiment of the invention may be implemented as a program product stored on computer-readable storage media within the content player 108. In this embodiment, the content player 108 comprises an embedded computer platform such as a set-top box. An alternative embodiment of the invention may be implemented as a program product that is transferred to memory within a computer system, for example, as executable instructions embedded within an Internet website. In this embodiment, the content player 108 comprises the computer system.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. For example, aspects of the present invention may be implemented in hardware or software or in a combination of hardware and software. One embodiment of the invention may be implemented as a program product for use with a computer system. The program(s) of the program product define functions of the embodiments (including the methods described herein) and can be contained on a variety of computer-readable storage media. Illustrative computer-readable storage media include, but are not limited to: (i) non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM discs readable by a CD-ROM drive, flash memory, ROM chips, or any type of solid-state non-volatile semiconductor memory) on which information is permanently stored; and (ii) writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or any type of solid-state random-access semiconductor memory) on which alterable information is stored. Such computer-readable storage media, when carrying computer-readable instructions that direct the functions of the present invention, are embodiments of the present invention.
In view of the foregoing, the scope of the present invention is determined by the claims that follow.
Claims (11)
[0001]
1. Method for transferring digital video content including a sequence of parts, each part comprising multiple frames of video data, the method characterized in that it comprises: receiving a scene complexity map associated with the digital video content and specifying a complexity level associated with each part, in the sequence of parts, of the digital video content; identifying a plurality of encoded video streams associated with the digital video content, wherein each encoded video stream is associated with a different bit rate and includes a part encoded at that different bit rate for each part of the digital video content; receiving a sequence header index for each of the plurality of encoded video streams, each sequence header index specifying a switch point for each part in the sequence of parts, each switch point storing a playback offset associated with each part; determining, based on the scene complexity map, the complexity level associated with a first part of the sequence of parts of the digital video content; dynamically determining, during playback of a different part of the digital video content and based on the complexity level and switch point associated with the first part, a first encoded video stream included in the plurality of encoded video streams from which to transfer a first encoded part corresponding to the first part of the digital video content; and transferring for playback the first encoded part from the first encoded video stream to a content buffer residing within a content player device; wherein one or more encoded video streams included in the plurality of encoded video streams have a lower bit rate than a first encoded video stream included in the plurality of encoded video streams; and wherein the first encoded video stream does not include at least one encoded part having a complexity associated with an encoded part included in at least one of the one or more encoded video streams.
[0002]
2. Method according to claim 1, characterized in that the sequence header index further stores a data packet that includes a key frame for each part in each of the encoded video streams, wherein the sequence header indices are generated by locating the key frames for each part of the encoded video streams, wherein the key frames are located based on a sequence start code specified in a sequence header included in the key frames, wherein the playback offset of each part is determined based on the part's location in the sequence of parts for each of the plurality of encoded video streams, and wherein the first part has a high complexity level and the bit rate associated with the first encoded video stream is greater than the bit rate associated with at least one other encoded video stream included in the plurality of video streams.
[0003]
3. Method according to claim 1, characterized in that the first part has a low level of complexity, and the bit rate associated with the first encoded video stream is less than the bit rate associated with at least another encoded video stream included in the plurality of video streams.
[0004]
4. Method according to claim 1, characterized in that it additionally comprises the step of determining, based on the scene complexity map, the level of complexity associated with a second part of the sequence of parts that is subsequent in time to the first part of the digital video content.
[0005]
5. Method according to claim 4, characterized in that the first part of the digital video content has a complexity level lower than the complexity level of the second part, and the bit rate associated with the first encoded video stream is less than the bit rate associated with a second encoded video stream from which a second encoded part corresponding to the second part is transferred.
[0006]
6. Method according to claim 1, characterized in that the step of determining the first encoded video stream from which to transfer the first encoded part is additionally based on the size of the content buffer.
[0007]
7. Method according to claim 1, characterized in that the step of determining the first encoded video stream from which to transfer the first encoded part is additionally based on the bandwidth available to transfer the first encoded part.
[0008]
8. Method according to claim 1, characterized in that the step of determining the first encoded video stream from which to transfer the first encoded part is additionally based on a type of display being generated.
[0009]
9. Method according to claim 1, characterized in that the first part of the digital video content has a high complexity level, wherein a second part of the sequence of parts has a lower complexity level, and in that the first encoded video stream includes the first encoded part and does not include any encoded part corresponding to the second part.
[0010]
10. Method according to claim 9, characterized in that a second encoded part corresponding to the second part is transferred from a second encoded video stream associated with a bit rate lower than the bit rate associated with the first encoded video stream; wherein the reproduction of the first encoded part and the second encoded part is synchronized using the respective switching points of the parts.
[0011]
11. System for transferring digital video content, the system characterized in that it comprises: one or more computer processors; and a memory containing a program which, when executed by the one or more processors, performs an operation of adaptively transferring digital video content including a sequence of parts, each part comprising multiple frames of video data, the operation comprising: receiving a scene complexity map associated with the digital video content and specifying a complexity level associated with each part, in the sequence of parts, of the digital video content; identifying a plurality of encoded video streams associated with the digital video content, wherein each encoded video stream is associated with a different bit rate and includes a part encoded at that different bit rate for each part of the digital video content; receiving a sequence header index for each of the plurality of encoded video streams, each sequence header index specifying a switch point for each part in the sequence of parts, each switch point storing a playback offset associated with each part; determining, based on the scene complexity map, the complexity level associated with a first part of the sequence of parts of the digital video content; dynamically determining, during playback of a different part of the digital video content and based on the complexity level and switch point associated with the first part, a first encoded video stream included in the plurality of encoded video streams from which to transfer a first encoded part corresponding to the first part of the digital video content; and transferring for playback the first encoded part from the first encoded video stream to a content buffer; wherein one or more encoded video streams included in the plurality of encoded video streams have a lower bit rate than a first encoded video stream included in the plurality of encoded video streams; and wherein the first encoded video stream does not include at least one encoded part having a complexity associated with an encoded part included in at least one of the one or more encoded video streams.
Similar technologies:
Publication number | Publication date | Patent title
BR112013013944B1|2022-01-25|Method and system for transferring digital video content
US10972772B2|2021-04-06|Variable bit video streams for adaptive streaming
US10123059B2|2018-11-06|Fast start of streaming digital media playback with deferred license retrieval
US9769505B2|2017-09-19|Adaptive streaming for digital content distribution
US9781183B2|2017-10-03|Accelerated playback of streaming media
BR112012002182B1|2020-12-08|METHOD FOR DETERMINING IF THE DIGITAL CONTENT DOWNLOADED FROM A CONTENT BUFFER FOR A CONTENT BUFFER CAN BE ACCESSED FOR THE CONTENT BUFFER REPRODUCTION AT A PRE-DETERMINED BIT RATE WITHOUT PROVISING A LITTLE BUFFERER EMPTY, THEREFORE A LOT OF BUFFITER DETAILS, THEREFORE LOT CONTENT PLAYER DEVICE
US20120281965A1|2012-11-08|L-cut stream startup
Patent family:
Publication number | Publication date
EP2649599A1|2013-10-16|
WO2012078655A1|2012-06-14|
KR101500892B1|2015-03-09|
CA2819716A1|2012-06-14|
MX2013006313A|2013-07-29|
US20120144444A1|2012-06-07|
US8689267B2|2014-04-01|
KR20130093675A|2013-08-22|
CA2819716C|2016-08-02|
BR112013013944A2|2016-09-27|
BR112013013944A8|2018-07-03|
JP2014502483A|2014-01-30|
EP2649599B1|2018-06-13|
JP5684920B2|2015-03-18|
DK2649599T3|2018-09-24|
EP2649599A4|2014-10-15|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title

US20020129374A1|1991-11-25|2002-09-12|Michael J. Freeman|Compressed digital-data seamless video switching system|
US20030014484A1|2000-11-09|2003-01-16|Arnon Netzer|Scheduling in a remote-access server|
NO315887B1|2001-01-04|2003-11-03|Fast Search & Transfer As|Procedures for transmitting and socking video information|
US7274661B2|2001-09-17|2007-09-25|Altera Corporation|Flow control method for quality streaming of audio/video/media over packet networks|
US20030185301A1|2002-04-02|2003-10-02|Abrams Thomas Algie|Video appliance|
US7844992B2|2003-09-10|2010-11-30|Thomson Licensing|Video on demand server system and method|
US7480701B2|2004-12-15|2009-01-20|Microsoft Corporation|Mixed-media service collections for multimedia platforms|
US20060136981A1|2004-12-21|2006-06-22|Dmitrii Loukianov|Transport stream demultiplexor with content indexing capability|
US7908627B2|2005-06-22|2011-03-15|At&T Intellectual Property I, L.P.|System and method to provide a unified video signal for diverse receiving platforms|
US7979885B2|2005-08-11|2011-07-12|Harmonic Inc.|Real time bit rate switching for internet protocol television|
US8607287B2|2005-12-29|2013-12-10|United Video Properties, Inc.|Interactive media guidance system having multiple devices|
US8914529B2|2007-01-22|2014-12-16|Microsoft Corporation|Dynamically adapting media content streaming and playback parameters for existing streaming and playback conditions|
US7802286B2|2007-07-24|2010-09-21|Time Warner Cable Inc.|Methods and apparatus for format selection for network optimization|
US7860996B2|2008-05-30|2010-12-28|Microsoft Corporation|Media streaming with seamless ad insertion|
ES2624910T3|2008-06-06|2017-07-18|Amazon Technologies, Inc.|Client side sequence switching|
US9167007B2|2008-06-06|2015-10-20|Amazon Technologies, Inc.|Stream complexity mapping|
US8396114B2|2009-01-29|2013-03-12|Microsoft Corporation|Multiple bit rate video encoding using variable bit rate and dynamic resolution for adaptive video streaming|
EP2257073A1|2009-05-25|2010-12-01|Canon Kabushiki Kaisha|Method and device for transmitting video data|
US9456243B1|2003-06-06|2016-09-27|Arris Enterprises, Inc.|Methods and apparatus for processing time-based content|
US7515710B2|2006-03-14|2009-04-07|Divx, Inc.|Federated digital rights management scheme including trusted systems|
US8966103B2|2007-12-21|2015-02-24|General Instrument Corporation|Methods and system for processing time-based content|
WO2010080911A1|2009-01-07|2010-07-15|Divx, Inc.|Singular, collective and automated creation of a media guide for online content|
EP2507995A4|2009-12-04|2014-07-09|Sonic Ip Inc|Elementary bitstream cryptographic material transport systems and methods|
US8914534B2|2011-01-05|2014-12-16|Sonic Ip, Inc.|Systems and methods for adaptive bitrate streaming of media stored in matroska container files using hypertext transfer protocol|
KR101840008B1|2011-06-24|2018-05-04|에스케이플래닛 주식회사|High quality video streaming service system and method|
US9615126B2|2011-06-24|2017-04-04|Google Technology Holdings LLC|Intelligent buffering of media streams delivered over internet|
US8909922B2|2011-09-01|2014-12-09|Sonic Ip, Inc.|Systems and methods for playing back alternative streams of protected content protected using common cryptographic information|
US8964977B2|2011-09-01|2015-02-24|Sonic Ip, Inc.|Systems and methods for saving encoded media streamed using adaptive bitrate streaming|
US8935425B2|2011-10-05|2015-01-13|Qualcomm Incorporated|Switching between representations during network streaming of coded multimedia data|
US20130223509A1|2012-02-28|2013-08-29|Azuki Systems, Inc.|Content network optimization utilizing source media characteristics|
US9392304B2|2012-02-29|2016-07-12|Hulu, LLC|Encoding optimization using quality level of encoded segments|
US20140068097A1|2012-08-31|2014-03-06|Samsung Electronics Co., Ltd.|Device of controlling streaming of media, server, receiver and method of controlling thereof|
GB2505486B|2012-08-31|2016-01-06|Samsung Electronics Co Ltd|Streaming media|
US9191457B2|2012-12-31|2015-11-17|Sonic Ip, Inc.|Systems, methods, and media for controlling delivery of content|
US9313510B2|2012-12-31|2016-04-12|Sonic Ip, Inc.|Use of objective quality measures of streamed content to reduce streaming bandwidth|
US9906785B2|2013-03-15|2018-02-27|Sonic Ip, Inc.|Systems, methods, and media for transcoding video data according to encoding parameters indicated by received metadata|
US9094737B2|2013-05-30|2015-07-28|Sonic Ip, Inc.|Network video streaming with trick play based on separate trick play files|
EP2890075B1|2013-12-26|2016-12-14|Telefonica Digital España, S.L.U.|A method and a system for smooth streaming of media content in a distributed content delivery network|
US9866878B2|2014-04-05|2018-01-09|Sonic Ip, Inc.|Systems and methods for encoding and playing back video at different frame rates using enhancement layers|
WO2015200484A1|2014-06-24|2015-12-30|Thomson Licensing|Streaming and downloading of video content according to available bandwidth|
KR102212762B1|2014-09-17|2021-02-05|삼성전자주식회사|Codec and devices including the same|
US10567816B2|2015-04-30|2020-02-18|Comcast Cable Communications, Llc|Delivering content|
CN105100823B|2015-09-01|2019-03-12|京东方科技集团股份有限公司|A kind of processing method, device, encoder and the decoder of adaptive media business|
TWI594607B|2015-11-27|2017-08-01|鴻海精密工業股份有限公司|System and method for dynamic control of downloading multiple video|
US10332534B2|2016-01-07|2019-06-25|Microsoft Technology Licensing, Llc|Encoding an audio stream|
Legal status:
2018-12-18| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2020-03-17| B15K| Others concerning applications: alteration of classification|Free format text: THE PREVIOUS CLASSIFICATION WAS: G08C 15/00 Ipc: H04N 21/2343 (2011.01), H04N 21/845 (2011.01) |
2020-03-17| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2021-08-17| B06A| Patent application procedure suspended [chapter 6.1 patent gazette]|
2021-11-09| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2022-01-25| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 06/12/2011, SUBJECT TO THE LEGAL CONDITIONS. PATENT GRANTED PURSUANT TO ADI 5.529/DF, WHICH DETERMINES THE CHANGE OF THE GRANT TERM. |
Priority:
Application number | Filing date | Patent title
US12/961,375|US8689267B2|2010-12-06|2010-12-06|Variable bit video streams for adaptive streaming|
US12/961,375|2010-12-06|
PCT/US2011/063564|WO2012078655A1|2010-12-06|2011-12-06|Variable bit video streams for adaptive streaming|